Bayesian brain mapping (BBM) is a technique for producing individualized functional brain topographic maps from existing group-level network maps or templates. Those templates take the form of either a group parcellation, a set of group ICA maps, or other types of network maps like PROFUMO or NNF. BBM is a generalization of template ICA (Mejia et al. 2020), in which the template is a set of group ICA maps. In BBM, the template is not used directly in the individual-level model, but rather forms is the basis of population-derived priors, which are used in a Bayesian model. The priors are derived from a training set of subjects from a representative population, either using holdout data from the focal study or using data from a publicly available neuroimaging repository such as the Human Connectome Project (HCP) (CITE) or Alzheimer’s Disease Neuroimaging Initiative (ADNI) (CITE).
The Bayesian model in BBM is a single-subject ``hierarchical’’ source separation model, with parameters representing the spatial topography of different networks and their corresponding temporal activation profiles. The model is hierarchical in the sense that it includes population-derived priors on those parameters. While hierarchical models typically require multi-subject data to establish the shared prior parameters, in BBM the model is fit to a single subject using priors that are already established. This allows BBM to be highly pragmatic, computationally efficient, and potentially clinically applicable.
The use of population-derived priors in BBM reduces noise while retaining relevant signal. This results in more reliable individual-level network topography maps and the functional connectivity between them (Mejia et al. 2020) (also CITE Mejia 2022 and 2025). An important feature of the BBM priors on spatial topography is that they spatially vary within each network, which allows individual differences to be expressed where they exist while reducing noise in background regions. Furthermore, due to the continuous, overlapping nature of BBM network maps, noise reduction in background regions of one network indirectly contributes to estimation of signal in the same region in other networks. The powerful noise-reduction properties of BBM allows it to produce reliable maps of individual functional topography with moderate scan duration, without the need for ``dense sampling’’ of individuals.
There are two steps to performing BBM, which are implemented in the
BayesBrainMap R package: (1) prior estimation and
(2) model fitting (Figure @ref(fig:model-overview)). Here, our
primary focus is creating and sharing population-derived priors for
using data from the HCP, based on various templates. These priors can be
directly used to perform BBM in studies of healthy young adults. To
perform BBM to study individuals from other populations, or using other
templates, we provide and describe the code used to produce the
HCP-derived priors, so that this workflow can be easily reproduced using
other datasets. We also provide guidance for visual investigation of the
priors to ensure quality.
Finally, we illustrate the model fitting step using an individual from the HCP and individuals from an external study, the midnight scan club (MSC) (CITE). The MSC focuses on a similar population as the HCP but differs in acquisition methods. Since the population-derived priors used in BBM encode the distribution of signal, not noise, priors derived from one study can be applied to other studies with differing noise properties, as long as the populations are similar. We will illustrate this through application of HCP-derived priors to data from the MSC. Application to the MSC will also exemplify the ability of BBM to reveal individual features of functional topography from a single session of data, without the need for dense sampling.
Overview of the Bayesian Brain Mapping Workflow
Here we provide a brief overview of the BBM statistical model. For a given subject and fMRI session, let \({y}_{tv}\) be the preprocessed BOLD fMRI time series at voxel or vertex \(v\) and time point \(t\). The BBM model assumes that the BOLD time series can be decomposed into contributions from a set of networks, similar to ICA. The first level of the model is given by
\[ {y}_{tv} = \sum_{q=1}^{Q} a_{tq} s_{qv} + {e}_{tv} = \mathbf{a}_t^\top \mathbf{s}_v + {e}_{tv}, \quad {e}_{tv} \sim N(0, \tau^2_v), \]
where \(s_{qv}\) is the spatial engagement of network \(q\) at voxel or vertex \(v\), and \(a_{tq}\) is the temporal activation of network \(q\) at time point \(t\). The vectors \(\mathbf{a}_t\) and \(\mathbf{s}_v\) combine those values across all \(Q\) networks.
The second level of the model incorporates the population-derived prior on the spatial topography in \({s}_{qv}\), as described in [CITE Mejia 2020]:
\[ s_{qv} = s^0_{qv} + \delta_{qv},\quad\delta_{qv} \sim N(0, \sigma^2_{qv}), \] where \(s^0_{qv}\) and \(\sigma^2_{qv}\) are known via the prior, and the deviation terms \(\delta_{qv}\) represent individual differences between the subject and the population average. Optionally, spatial dependencies in \(\delta_{qv}\) can be modeled via a spatial prior for additional accuracy and power, though at a higher computational cost (CITE Mejia 2022). Note that structured noise may exist in the data that is not well-captured by the residual noise term. While in ICA it is common to model structured noise as components, in BBM these are not included in the model because can be removed beforehand as described in [CITE Mejia 2020] or via other methods like ICA-FIX.
The third level of the mdoel incorporates the population-derived prior on the functional connectivity between networks, which is represented by the covariance of \(\mathbf{a}_t\). This is accomplished by assuming a multivariate Normal prior on \(\mathbf{a}_t\) with mean zero and covariance \(\mathbf{G}\), and assuming a population-derived hyperprior on \(\mathbf{G}\):
\[ \mathbf{a}_t \sim N(\mathbf{0}, \mathbf{G}), \mbox{ where } \mathbf{G} \sim p(\mathbf{G}) , \]
See [CITE Mejia 2025] for details on the population-derived prior on the functional connectivity. Briefly, two choices exist for this prior: the conjugate Inverse-Wishart distribution, or a novel Cholesky-based prior developed for this model, which involves sampling. The former is faster, while the latter has somewhat better performance but is more computationally intensive.
Here, we build HCP-derived priors for Bayesian brain mapping using several different choices of template, including two different parcellations and group ICA maps (Table @ref(tab:template-summary)). Specifically, we use the 17-network Yeo parcellation (CITE), the MSC group parcellation (CITE and get the real name), and the HCP group ICA maps with 15, 25, and 50 components. For each template, we build priors with and without global signal regression (GSR), since whether or not to perform GSR is an important choice that remains the subject of debate.
Since the functional MRI data in the HCP was acquired with two different phase-encoding directions (left-to-right, LR, and right-to-left, RL), we build the HCP-derived priors using both phase encoding directions separately, as well as a combined version. Comparing the LR and RL priors allows us to assess the impact of phase encoding direction on the priors, while the combined version provides a general purpose prior that is not specific to either the LR or RL acquisition.
Before estimating the BBM priors, we first select a high-quality, balanced subject sample to ensure high-quality and representative priors. Starting from the full HCP sample of \(N=1206\) subjects, we apply several filters to obtain the final sample for the priors. First, we exclude subjects with insufficient scan duration after motion scrubbing using a framewise displacement (FD) threshold of 0.5 mm and dropping the first 15 frames. Second, we exclude any related subjects. Finally, we balanced sex within age groups. See Appendix B for details. After all filters, our final sample contains approximately 350 subjects for each encoding condition (LR, RL) and for the combined dataset, which are used to estimate the priors.
Setup
To reproduce this workflow, first follow the setup process outlined in Appendix A.
Filter Subjects by Sufficient fMRI Scan Duration
See Appendix B.1 and script:
1_fd_time_filtering.R
Filter Unrelated Subjects
See Appendix B.2 and script:
2_unrelated_filtering.R
Balance sex within age groups
See Appendix B.3 and script:
3_balance_age_sex.R
The resulting subject list
(valid_combined_subjects_balanced.rds) is used throughout
the rest of the analysis.
estimate_prior()In this step, we estimate group-level statistical priors using the
estimate_prior() function from the
BayesBrainMap package.
The encoding parameter is set to combined,
LR, and RL to use the final lists of subjects
saved in Step 3.3 (valid_combined_subjects_balanced.rds,
valid__LR_subjects_balanced.rds, and
valid_RL_subjects_balanced.rds). The combined
list includes individuals who passed motion filtering in both LR and RL
directions for both sessions, were unrelated, and were sex-balanced
within age groups. The LR and RL lists include
subjects who met these criteria independently for each direction.
If encoding is combined, we include only REST1 sessions
from both phase-encoding directions:
rfMRI_REST1_LR_Atlas_MSMAll_hp2000_clean.dtseries.nii
rfMRI_REST1_RL_Atlas_MSMAll_hp2000_clean.dtseries.nii
If encoding is LR or RL, we use both REST1
and REST2 sessions from the specified direction:
For LR:
rfMRI_REST1_LR_Atlas_MSMAll_hp2000_clean.dtseries.nii
rfMRI_REST2_LR_Atlas_MSMAll_hp2000_clean.dtseries.nii
For RL:
rfMRI_REST1_RL_Atlas_MSMAll_hp2000_clean.dtseries.nii
rfMRI_REST2_RL_Atlas_MSMAll_hp2000_clean.dtseries.nii
To standardize scan duration and improve data quality, we apply both
initial volume dropping and temporal truncation using parameters handled
directly by the estimate_prior() function from the
BayesBrainMap package.
Specifically:
drop_first = 15 removes the first 15 volumes from
each scan to eliminate early signal instability and motion
artifacts.
scrub defines volumes to exclude after a target
duration. In our case, we truncate data to the first 10 minutes (600
seconds), excluding any volumes beyond that point.
See Appendix C for more details.
We consider two types of group-level parcellations for estimating priors:
HCP GICA parcellation (GICA15.dscalar.nii, etc.),
available in the data_OSF/inputs folder. These files were
downloaded from the HCP website, specifically from the CIFTI
Subject-specific ICA Parcellations dataset for 15-, 25-, and
50-dimensionalities.
Yeo17 parcellation (CITE). For details on how this parcellation was processed and simplified for use, see Appendix D.
Each of these parcellations was used to estimate priors with and without global signal regression (GSR), resulting in eight total priors saved as .rds files. See Table for a summary of the parcellations and GSR combinations.
# This script estimates and saves functional connectivity priors
# for both spatial topography and connectivity.
# It supports both GICA-based (15/25/50 ICs) and Yeo17 parcellations,
# with or without global signal regression (GSR).
# For priors using the "combined" subject list, it loads REST1-LR and REST1-RL
# scans for each subject,
# drops the first 15 volumes, and truncates each scan to approximately 10 minutes.
# Outputs:
# - Priors `.rds` file saved in `dir_results`
source("5_estimate_prior.R")
Running estimate_prior() on the full
"combined" subject list (~350 subjects) takes approximately
27 hours and uses 135 GB of memory.
For an example of how to run estimate_prior() and all
relevant parameters, see Appendix E.
In this section, we visualize both the parcellation maps and the
priors outputs (mean and variance) for each parcellation scheme used in
the study: Yeo17, 15 IC, 25 IC, and 50 IC using the
combined list of subjects.
We also visualize their corresponding functional connectivity (FC) priors.
Script:
8_visualization_Yeo17parcellations.R
This script creates one PNG image per parcel (17 in total), where only the selected parcel is colored and all others are white. The parcellation used is Yeo17, created in Appendix D.
Images are saved in
data-OSF/outputs/parcellations_plots/Yeo17.
Script: 9_visualization_GICAparcellations.R
This script loops over all independent components for each
parcellation dimensionality (nIC = 15, 25, 50) and
generates two images per component:
A cortical surface map (e.g.,
GICA15_IC1.png)
A subcortical view (e.g.,
GICA15_IC1_sub.png)
The resulting images are saved in the following folders:
data_OSF/outputs/parcellations_plots/GICA15/
data_OSF/outputs/parcellations_plots/GICA25
data_OSF/outputs/parcellations_plots/GICA50
Each pair of files corresponds to a specific ICA component and captures its spatial map across brain regions.
Script: 6_visualization_prior.R
This script loads each estimated prior file from
priors_rds/ and plots both the mean and standard deviation
components for all independent components (ICs).
All images are organized into folders by number of ICs, GSR setting, and corresponding list of subjects used, e.g.:
data_OSF/priors_plots/GICA15/combined/GSR/
data_OSF/priors_plots/GICA15/combined/noGSR/
data_OSF/priors_plots/GICA25/LR/noGSR/
data_OSF/priors_plots/Yeo17/RL/GSR/
...
In this section, we present a comparative visual summary of the estimated group-level priors.
For each parcellation type Yeo17, 15 ICs, 25 ICs, and 50 IC, we display:
First and Last Parcellation Map
First and Last Component Mean
First and Last Component Standard Deviation
These summaries are shown in a 2-column grid layout per parcellation to highlight spatial structure and variability.
All images were generated using the scripts:
8_visualization_Yeo17parcellations.R
9_visualization_GICAparcellations.R
6_visualization_prior.R